Intermediate Layer
The intermediate layer sits between staging and marts. These models perform complex calculations and transformations but are not exposed to end users. They help keep your transformations modular and maintainable.
Example Folder Structure
models
└── intermediate
├── finance
│ ├── int_order_payments.sql
│ ├── int_daily_revenue.sql
│ └── int_refund_calculations.sql
└── marketing
├── int_campaign_metrics.sql
├── int_customer_interactions.sql
└── int_attribution_paths.sql
Structure Best Practices
- ✅ Business Domain Grouping ~ Unlike staging, intermediate models should be grouped by business function
- ✅ File Names ~ Use
int_prefix to clearly identify intermediate models - ✅ Keep It Hidden ~ These models should not be exposed to end users or BI tools
Models
Purpose
Intermediate models serve to:
- Break down complex transformations into manageable steps
- Create reusable logic that multiple marts can reference
- Improve query performance by pre-calculating common metrics
Common Use Cases
| Use Case | Example |
|---|---|
| Multi-step calculations | Revenue attribution across channels |
| Aggregations | Daily/weekly summaries used by multiple marts |
| Join simplification | Combining multiple staging models |
| Business logic | Complex categorizations or groupings |
Materialization
Intermediate models should typically be materialized as ephemeral or views:
# dbt_project.yml
models:
your_project:
intermediate:
+materialized: ephemeral
Don't create tables/views in the warehouse Get compiled into downstream SQL Reduce warehouse storage usage Ensure fresh data
Best Practices
Documentation
# models/intermediate/finance/int_order_payments.yml
version: 2
models:
- name: int_order_payments
description: Combines order and payment data for revenue calculations
columns:
- name: order_id
description: Primary key from stg_orders
Testing
Even though these models aren't customer-facing, they should be thoroughly tested:
- Primary key uniqueness
- Foreign key relationships
- Business logic assertions
- Not-null constraints
Performance Considerations
- Use CTEs for readability
- Consider materializing as a table if frequently used
- Include only necessary columns
- Use appropriate indexes
Don't create intermediate models just to create them Avoid circular dependencies Don't expose to BI tools Keep transformations focused and purposeful
Example Model
-- models/intermediate/finance/int_order_payments.sql
WITH orders AS (
SELECT * FROM {{ ref('stg_shopify__orders') }}
),
payments AS (
SELECT * FROM {{ ref('stg_stripe__payments') }}
),
final AS (
SELECT
orders.order_id,
orders.created_at,
payments.amount,
payments.status,
-- Complex business logic here
CASE
WHEN payments.status = 'success' THEN amount
ELSE 0
END AS confirmed_revenue
FROM orders
LEFT JOIN payments USING (order_id)
)
SELECT * FROM final
Keep models focused on a single purpose Use clear, consistent naming Document assumptions and business logic Test thoroughly Consider performance implications